Consolidate Dataclass data update methods - use DIB for update only#7483
Consolidate Dataclass data update methods - use DIB for update only#7483
Conversation
There was a problem hiding this comment.
Pull request overview
This PR consolidates DataClass and Sample update behavior around DataIteratorBuilder (DIB)-based updates, removing LSID as an update/merge key and cleaning up deprecated DataClass storage by dropping the provisioned lsid column via a module upgrade.
Changes:
- Remove LSID-as-key support for sample/data update & merge; enforce RowId or Name-based keys and ignore LSID when provided alongside valid keys.
- Introduce shared
updateRowsUsingPartitionedDIB()logic inAbstractQueryUpdateServiceand route Sample/DataClass updates through it. - Remove provisioned DataClass
lsidcolumn (domain kind + upgrade code + DB scripts) and update integration tests accordingly.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| experiment/src/org/labkey/experiment/samples/AbstractExpFolderImporter.java | Stops setting the removed UseLsidForUpdate option during TSV import. |
| experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java | Routes updates through partitioned DIB; removes LSID-key paths and legacy update methods. |
| experiment/src/org/labkey/experiment/api/ExpDataImpl.java | Selects DataClass provisioned properties by RowId (not LSID) to support dropped provisioned LSID column. |
| experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java | Enforces RowId/Name keys, blocks LSID-only keying, blocks RowId in merge (behind feature), and uses partitioned DIB updates. |
| experiment/src/org/labkey/experiment/api/DataClassDomainKind.java | Removes lsid from provisioned DataClass base properties, indexes, and FKs. |
| experiment/src/org/labkey/experiment/ExperimentUpgradeCode.java | Adds deferred upgrade to drop provisioned DataClass lsid column (and related indices). |
| experiment/src/org/labkey/experiment/ExperimentModule.java | Bumps schema version to 26.005 and centralizes experimental flag constant usage. |
| experiment/src/org/labkey/experiment/ExpDataIterators.java | Updates derivation/update logic to use RowId/Name instead of LSID-for-update config; removes LSID update config usage. |
| experiment/src/client/test/integration/DataClassCrud.ispec.ts | Expands integration coverage for keying behavior (RowId/Name/LSID) and partitioned-duplicate detection. |
| experiment/resources/schemas/dbscripts/sqlserver/exp-26.004-26.005.sql | Executes Java upgrade to drop provisioned DataClass lsid column (SQL Server). |
| experiment/resources/schemas/dbscripts/postgresql/exp-26.004-26.005.sql | Executes Java upgrade to drop provisioned DataClass lsid column (PostgreSQL). |
| api/src/org/labkey/api/query/DefaultQueryUpdateService.java | Removes conversion-table hook and always uses getDbTable() for convertTypes() entrypoint. |
| api/src/org/labkey/api/query/AbstractQueryUpdateService.java | Adds updateRowsUsingPartitionedDIB() and small null-iterator safety in _pump(). |
| api/src/org/labkey/api/exp/query/ExpDataTable.java | Reorders enum constants (no functional behavior change expected). |
| api/src/org/labkey/api/exp/api/ExperimentService.java | Removes UseLsidForUpdate query option; adds shared experimental feature constant. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // using update from file, verify update using rowId for data that doesn't exist on this datacalss should fail. | ||
| errorResp = await ExperimentCRUDUtils.importData(server, "RowId\tDescription\n" + row3RowId + "\tupdate\n", emptyDataClass, "UPDATE", topFolderOptions, editorUserOptions); | ||
| expect(errorResp.text).toContain('Data not found for [' + row3RowId + ']'); |
There was a problem hiding this comment.
Typo in comment: "datacalss" should be "dataclass".
| // update date twice specifying the rowId across multiple partitions | ||
| await server.post('query', 'updateRows', { |
There was a problem hiding this comment.
Typo in comment: "update date" should be "update data".
| OptionalFeatureService.get().addExperimentalFeatureFlag(ExperimentService.EXPERIMENTAL_FEATURE_ALLOW_ROW_ID_MERGE, "Allow RowId to be accepted when merging samples or dataclass data", | ||
| "If the incoming data includes a RowId column we will allow the column but ignore it's values.", false); |
There was a problem hiding this comment.
Grammar in this feature description: "ignore it's values" should be "ignore its values".
| di = new SampleUpdateDerivationDataIterator(di, context, _container, _user, _currentDataType, _checkRequiredParents); | ||
| else | ||
| di = new DataUpdateDerivationDataIterator(di, context, _container, _user, _currentDataType, _checkRequiredParents); | ||
| di = new DataUpdateDerivationDataIterator(di, context, _container, _user, _currentDataType, _checkRequiredParents);// |
There was a problem hiding this comment.
Remove the trailing "//" at the end of this line; it looks like leftover debugging text and adds noise to the code.
| if (hasNext) | ||
| { | ||
| String lsid = (String) get(_lsidCol); | ||
| String lsid = (String) get(_lsidCol); // why lsid?, insert or merge |
There was a problem hiding this comment.
Please remove the inline comment "// why lsid?, insert or merge". It reads like a note-to-self and doesn't clarify behavior for future readers.
| for (Map<String, Object> dataRow : dataRows.values()) | ||
| lsids.add((String) dataRow.get("lsid")); | ||
| lsids.add((String) dataRow.get("lsid")); // ? |
There was a problem hiding this comment.
Remove the "// ?" comment here. If there's an open question about whether LSIDs are always present in these rows, it should be resolved in code (e.g., null-check + clear error) rather than left as an inline question.
| // update date twice specifying the name across multiple partitions | ||
| await server.post('query', 'updateRows', { |
There was a problem hiding this comment.
Typo in comment: "update date" should be "update data".
|
@XingY there's an exp-26.004-26.005.sql script in 26.3 that should be merged to develop shortly. |
OK. Thanks for the heads-up. |
|
Finding #1 — [Severity: Critical] — SQL Ambiguity — experiment/src/org/labkey/experiment/api/ExpDataClassType.java:87 The generated SQL is effectively: Why it matters: Any DataClass with attachment-type properties will cause a SQL error when the attachment parent lookup runs (e.g., during attachment integrity checks, admin pages, or file management), breaking the feature Finding #2 — [Severity: High] — Behavioral side effect — api/src/org/labkey/api/dataiterator/StatementDataIterator.java:539-544 Why it matters: This is a shared, low-level data iterator class. A behavioral change here could subtly affect SampleType updates, List updates, or any other table that goes through StatementDataIterator. Bugs would manifest as Finding #3 — [Severity: Medium] — Incorrect comment — experiment/src/org/labkey/experiment/ExperimentUpgradeCode.java:1002 Finding #4 — [Severity: Medium] — Wrong import / semantic mismatch — experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java:1031,1045 Finding #5 — [Severity: Medium] — Debug artifacts left in code — multiple files
Why it matters: These look like personal notes/TODOs that should be resolved before merge. Finding #6 — [Severity: Medium] — Duplicate detection only catches first duplicate — api/src/org/labkey/api/query/AbstractQueryUpdateService.java:985-1003 Finding #7 — [Severity: Low] — Null configParameters passed to recordDataIteratorUsed — experiment/src/org/labkey/experiment/api/ExpDataClassDataTableImpl.java:1451 Finding #8 — [Severity: Low] — _tableInfo field never used — api/src/org/labkey/api/dataiterator/DataClassUpdateAddColumnsDataIterator.java:38 Phase 4: Security No security issues found. The changes maintain existing authorization patterns — updates go through the standard QueryUpdateService pipeline which checks permissions. The SimpleFilter in Phase 5: Interactions and Side Effects
Overall Assessment This is a well-structured consolidation that brings DataClass updates in line with the established SampleType DIB pattern and properly eliminates the LSID provisioned column. The most critical issue is the ambiguous Name column in |
Rationale
Related Pull Requests
Changes